ML-Hierarchical Clustering
Hierarchical clustering produces a nested sequence of clusters, organized as a tree that is also called a dendrogram.
Types of Hierarchical Clustering
Agglomerative (bottom-up) clustering: It builds the dendrogram (tree) from the bottom level, where each data point starts as its own cluster, and
- repeatedly merges the most similar (or nearest) pair of clusters
- stops when all the data points are merged into a single cluster (i.e., the root cluster).
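The bottom-up procedure can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: it assumes Euclidean distance and single linkage (the distance between two clusters is the distance between their closest points), neither of which is specified in these notes, and the names `single_linkage` and `agglomerative` are made up for the example.

```python
from itertools import combinations
from math import dist


def single_linkage(a, b):
    """Distance between two clusters: the closest pair of points across them.

    Single linkage is an assumption of this sketch; other linkages
    (complete, average, ...) would plug in here instead.
    """
    return min(dist(p, q) for p in a for q in b)


def agglomerative(points):
    """Merge the nearest pair of clusters until one root cluster remains.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    which is the information a dendrogram encodes.
    """
    clusters = [(p,) for p in points]  # bottom level: every point is a singleton
    merges = []
    while len(clusters) > 1:
        # Find the most similar (nearest) pair of clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        a, b = clusters[i], clusters[j]
        merges.append((a, b, single_linkage(a, b)))
        # Replace the merged pair with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(a + b)
    return merges


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5)]
history = agglomerative(points)
for a, b, d in history:
    print(f"merge {a} + {b} at distance {d:.2f}")
```

Four points give three merges: the two tight pairs are joined first, and the last merge forms the root cluster that contains every point.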
Divisive (top-down) clustering: It starts with all data points in one cluster, the root, and
- splits the root into a set of child clusters, each of which is recursively divided further
- stops when only singleton clusters remain, i.e., each cluster contains a single data point
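A top-down run can be sketched the same way. The notes do not say how a cluster should be split, so the rule used here is purely an assumption for illustration: take the two farthest-apart points as seeds and assign every point to the nearer seed. The names `split` and `divisive` are hypothetical, and the sketch assumes all points are distinct.

```python
from itertools import combinations
from math import dist


def split(cluster):
    """Split a cluster in two around its two farthest-apart points (the seeds).

    Assumes the points are distinct, so each seed lands in its own half
    and neither half is empty.
    """
    seed_a, seed_b = max(combinations(cluster, 2), key=lambda pq: dist(*pq))
    left = [p for p in cluster if dist(p, seed_a) <= dist(p, seed_b)]
    right = [p for p in cluster if dist(p, seed_a) > dist(p, seed_b)]
    return left, right


def divisive(cluster):
    """Recursively split until only singletons remain.

    Returns a nested list: inner nodes are two-element lists,
    leaves are the individual points (tuples).
    """
    if len(cluster) == 1:
        return cluster[0]
    left, right = split(cluster)
    return [divisive(left), divisive(right)]


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5)]
tree = divisive(points)
print(tree)
```

The outer list is the root, each inner list is a child cluster, and the recursion ends at single points, matching the stopping condition above.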
Pros and Cons
Pros:
- Simple to implement and easy to understand
Cons:
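- Choosing the merge or split points is not easy
- A merge or split cannot be undone once it has been made
- Not well suited to large data sets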
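- Low execution efficiency: roughly O(t * n^2) time, where t is the number of iterations and n is the number of sample points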
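To make the efficiency concern concrete, the sketch below counts how many cluster pairs a naive agglomerative run has to examine. It assumes the simplest possible search (every remaining pair is inspected before each merge, with no cached distance matrix or priority queue); the function name `cluster_pair_comparisons` is made up for this example.

```python
from math import comb


def cluster_pair_comparisons(n):
    """Count cluster pairs examined by a naive agglomerative run on n points.

    With k clusters left, a naive search inspects k * (k - 1) / 2 pairs,
    and k shrinks from n down to 2 over the n - 1 merges.
    """
    return sum(k * (k - 1) // 2 for k in range(2, n + 1))


for n in (10, 100, 1000):
    total = cluster_pair_comparisons(n)
    # Closed form of the sum: (n + 1) * n * (n - 1) / 6, i.e. cubic in n.
    assert total == comb(n + 1, 3)
    print(n, total)
```

Under these assumptions, a tenfold increase in the number of points multiplies the work by roughly a thousand, which is why the method is a poor fit for large data sets.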